7 research outputs found
A Robust Approach Towards Distinguishing Natural and Computer Generated Images using Multi-Colorspace fused and Enriched Vision Transformer
Existing work on classifying natural and computer generated images is mostly
designed as a binary task, considering either natural images versus computer
graphics images only or natural images versus GAN generated images only, but
not natural images versus both classes of generated images. Also, even though
recent convolutional neural network and transformer based architectures
achieve remarkable accuracy on this forensic classification task, they are
seen to fail on images that have undergone post-processing operations
commonly performed to deceive forensic algorithms, such as JPEG compression
and the addition of Gaussian noise. This work
proposes a robust approach to distinguishing natural images from computer
generated images, covering both computer graphics and GAN generated images,
using a fusion of two vision transformers, each operating in a different
color space: one in RGB and the other in YCbCr. The proposed approach
achieves a high performance gain when compared
to a set of baselines, and also achieves higher robustness and generalizability
than the baselines. The features of the proposed model when visualized are seen
to obtain higher separability for the classes than the input image features and
the baseline features. This work also studies the attention map visualizations
of the networks of the fused model and observes that the proposed methodology
can capture more image information relevant to the forensic task of classifying
natural and generated images.
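The two-colorspace input described in the abstract above can be illustrated with a small sketch. It assumes the full-range ITU-R BT.601 (JFIF) conversion for YCbCr; the abstract does not say which YCbCr variant the authors use, and the function names here are hypothetical. The sketch only prepares the two views that would be fed to the two transformer branches; it is not the authors' model.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr.

    Assumption: ITU-R BT.601 coefficients as used by JFIF/JPEG.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (y, cb, cr)


def two_colorspace_views(pixels):
    """Return (rgb_view, ycbcr_view) for a flat list of RGB pixels.

    Hypothetical helper: each view would go to its own network branch.
    """
    rgb_view = [tuple(float(c) for c in p) for p in pixels]
    ycbcr_view = [rgb_to_ycbcr(*p) for p in pixels]
    return rgb_view, ycbcr_view
```

A gray pixel has neutral chroma in this convention, so both Cb and Cr sit at 128 while Y carries the intensity.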
Exploring Fairness in Pre-trained Visual Transformer based Natural and GAN Generated Image Detection Systems and Understanding the Impact of Image Compression in Fairness
It is not sufficient merely to construct computational models that can
accurately distinguish fake images from real images taken with a camera; it is
also important to examine whether these computational models are fair or
whether they produce biased outcomes that can eventually harm certain social
groups or cause serious security threats. Exploring fairness in forensic
algorithms is an initial step towards correcting these biases. Since visual
transformers are now widely used in image classification tasks because of
their high accuracy, this study explores bias in transformer based image
forensic algorithms that classify natural and GAN generated images. By
procuring a bias evaluation corpus, this
study analyzes bias in gender, racial, affective, and intersectional domains
using a wide set of individual and pairwise bias evaluation measures. As the
generalizability of the algorithms against image compression is an important
factor to be considered in forensic tasks, this study also analyzes the role of
image compression on model bias. To study this impact, a two phase evaluation
is followed, with one set of experiments carried out in the uncompressed
evaluation setting and the other in the compressed evaluation setting.
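As a rough illustration of a pairwise bias measure evaluated in two settings, the sketch below computes per-group accuracy and the absolute accuracy gap between two groups. The abstract does not list the measures actually used, so accuracy gap is an assumed stand-in, and the function names and record layout are hypothetical.

```python
from collections import defaultdict


def group_accuracy(records):
    """Per-group accuracy from (group, true_label, predicted_label) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, y_true, y_pred in records:
        total[group] += 1
        correct[group] += int(y_true == y_pred)
    return {g: correct[g] / total[g] for g in total}


def pairwise_gap(acc, group_a, group_b):
    """Absolute accuracy difference between two groups (0 means parity)."""
    return abs(acc[group_a] - acc[group_b])


def gap_by_setting(records_by_setting, group_a, group_b):
    """Compute the same gap once per evaluation setting.

    Assumed keys such as "uncompressed" and "compressed" mirror the
    two phase evaluation described in the abstract.
    """
    return {
        setting: pairwise_gap(group_accuracy(recs), group_a, group_b)
        for setting, recs in records_by_setting.items()
    }
```

Comparing the two values returned by `gap_by_setting` shows whether compression widens or narrows the gap between the chosen groups under this particular measure.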
Distinguishing Natural and Computer-Generated Images using Multi-Colorspace fused EfficientNet
Work on distinguishing natural images from photo-realistic computer-generated
ones addresses either natural images versus computer graphics or natural
images versus GAN images, one at a time. But in a real-world image forensic
scenario, it is essential to consider all categories of image generation,
since in most cases the generation method is unknown. To the best of our
knowledge, we are the first to approach the problem of distinguishing natural
images from photo-realistic computer-generated images as a three-class
classification task classifying natural, computer graphics, and GAN images. For
the task, we propose a Multi-Colorspace fused EfficientNet model that fuses,
in parallel, three EfficientNet networks trained via transfer learning, each
operating in a different colorspace (RGB, LCH, and HSV); the colorspaces were
chosen after analyzing the efficacy of various colorspace transformations for
this image forensics problem. Our model outperforms the baselines in terms of
accuracy, robustness towards post-processing, and generalizability towards
other datasets. We conduct psychophysics experiments to understand how
accurately humans can distinguish natural, computer graphics, and GAN images,
and observe that humans have difficulty classifying these images,
particularly the computer-generated ones, indicating the necessity of
computational algorithms for the task. We also analyze the behavior of our
model through visual explanations to identify the salient regions that
contribute to its decisions, and compare them with manual explanations
provided by human participants in the form of region markings. We observe
similarities between the two sets of explanations, indicating that our model
makes its decisions in a meaningful way.
Comment: 13 pages
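A parallel fusion of three branches followed by a three-class decision might look like the sketch below. It assumes fusion by concatenating the feature vectors of the branches and applying one linear layer with softmax; the abstract does not specify the fusion operator, so this is an illustrative assumption, and the class labels, function names, and weights are hypothetical.

```python
import math

# Hypothetical label names for the three classes named in the abstract.
CLASSES = ("natural", "computer_graphics", "gan")


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def fuse_and_classify(branch_features, weights, biases):
    """Concatenate per-branch features, then apply a linear layer + softmax.

    branch_features: one feature list per colorspace branch (e.g. RGB, LCH, HSV).
    weights: one row per class, each as long as the concatenated feature vector.
    """
    fused = [x for feats in branch_features for x in feats]  # concatenation
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(logits)
    return CLASSES[probs.index(max(probs))], probs
```

In a trained model the branch features would come from the three networks and the weights would be learned; here they are simply passed in by the caller.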
Blacks is to Anger as Whites is to Joy? Understanding Latent Affective Bias in Large Pre-trained Neural Language Models
The development of transformer based large Pre-trained Language Models (PLMs)
has brought groundbreaking inventions and highly significant performance
improvements to deep learning based Natural Language Processing. The wide
availability of unlabeled data in the deluge of human generated data, along
with self-supervised learning strategies, has accelerated the success of large
PLMs in language generation, language understanding, etc. But at the same time,
latent historical bias/unfairness in human minds towards a particular gender,
race, etc., encoded unintentionally or intentionally into the corpora, harms
and calls into question the utility and efficacy of large PLMs in many
real-world applications, particularly for protected groups. In this paper, we present
an extensive investigation towards understanding the existence of "Affective
Bias" in large PLMs to unveil any biased association of emotions such as anger,
fear, joy, etc., towards a particular gender, race or religion with respect to
the downstream task of textual emotion detection. We begin our exploration
with a corpus level analysis of affective bias, searching for imbalanced
distributions of affective words within a domain in the large scale corpora
that are used to pre-train and fine-tune PLMs.
Later, to quantify affective bias in model predictions, we perform an extensive
set of class-based and intensity-based evaluations using various bias
evaluation corpora. Our results show the existence of statistically significant
affective bias in the PLM based emotion detection systems, indicating biased
association of certain emotions towards a particular gender, race, and
religion.
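The corpus level step, looking for an imbalanced distribution of affective words within a domain, can be sketched as a simple co-occurrence count. The group term sets and the word-to-emotion lexicon are supplied by the caller; the function name and data layout are hypothetical, and sentence-level co-occurrence is an assumed simplification rather than the authors' procedure.

```python
import re
from collections import Counter


def affect_counts(sentences, group_terms, affect_lexicon):
    """Count emotion words in sentences that mention each group.

    group_terms: {group_name: set of lowercase terms identifying the group}
    affect_lexicon: {lowercase word: emotion label}
    """
    counts = {group: Counter() for group in group_terms}
    for sentence in sentences:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        for group, terms in group_terms.items():
            if any(tok in terms for tok in tokens):
                for tok in tokens:
                    emotion = affect_lexicon.get(tok)
                    if emotion is not None:
                        counts[group][emotion] += 1
    return counts
```

Normalizing each group's counter by its total would give per-group emotion distributions that can then be compared for imbalance.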
REDAffectiveLM: Leveraging Affect Enriched Embedding and Transformer-based Neural Language Model for Readers' Emotion Detection
Technological advancements in web platforms allow people to express and share
emotions towards textual write-ups written and shared by others. This brings
about different interesting domains for analysis: the emotion expressed by the
writer and the emotion elicited from the readers. In this paper, we propose a novel
approach for Readers' Emotion Detection from short-text documents using a deep
learning model called REDAffectiveLM. Within state-of-the-art NLP tasks, it is
well understood that utilizing context-specific representations from
transformer-based pre-trained language models helps achieve improved
performance. Within this affective computing task, we explore how incorporating
affective information can further enhance performance. Towards this, we
leverage context-specific and affect enriched representations by using a
transformer-based pre-trained language model in tandem with affect enriched
Bi-LSTM+Attention. For empirical evaluation, we procure a new dataset REN-20k,
besides using RENh-4k and SemEval-2007. We evaluate the performance of our
REDAffectiveLM rigorously across these datasets, against a vast set of
state-of-the-art baselines, where our model consistently outperforms baselines
and obtains statistically significant results. Our results establish that
utilizing affect enriched representation along with context-specific
representation within a neural architecture can considerably enhance readers'
emotion detection. Since the impact of affect enrichment specifically in
readers' emotion detection is not well explored, we conduct a detailed analysis
of the affect enriched Bi-LSTM+Attention using qualitative and quantitative
model behavior evaluation techniques. We observe that, compared to conventional
semantic embedding, affect enriched embedding increases the ability of the
network to effectively identify and assign weight to key terms responsible for
readers' emotion detection.
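How an attention layer assigns weight to key terms can be illustrated with a minimal attention pooling sketch over per-token hidden states. It assumes dot-product scoring against a single query vector; the abstract does not describe the scoring function of the Bi-LSTM+Attention component, so this is a generic form, not the REDAffectiveLM implementation, and the function name is hypothetical.

```python
import math


def attention_pool(hidden_states, query):
    """Return (weights, context) for a sequence of hidden state vectors.

    weights: softmax over dot-product scores, one weight per token.
    context: weighted sum of the hidden states.
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, context
```

Tokens whose hidden states align with the query receive larger weights, which is the sense in which attention highlights key terms in the input.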